Efficient permutation testing of variable importance measures by the example of random forests
Abstract
Hypothesis testing of variable importance measures (VIMPs) is still the subject of ongoing research. This particularly applies to random forests (RF), for which VIMPs are a popular feature. Among recent developments, heuristic approaches to parametric testing have been proposed whose distributional assumptions are based on empirical evidence. Other formal tests under regularity conditions were derived analytically. But these approaches can be computationally expensive or even practically infeasible. This problem also occurs with non-parametric permutation tests, which are, however, distribution-free and can generically be applied to any kind of prediction model and VIMP. Embracing this advantage, it is proposed to use sequential permutation tests and sequential p-value estimation to reduce the high computational costs associated with conventional permutation tests. These costs can be particularly high in the case of complex prediction models. Therefore, RF's popular and widely used permutation VIMP (pVIMP) serves as a practical and relevant application example. The results of simulation studies confirm the theoretical properties of the sequential tests, that is, the type-I error probability is controlled at a nominal level and a high power is maintained with considerably fewer permutations needed compared to conventional permutation testing. The numerical stability of the methods is investigated in two additional application studies. In summary, theoretically sound sequential permutation testing of VIMP is possible at greatly reduced computational costs. Recommendations for application are given. A respective implementation for RF's pVIMP is provided through the accompanying R package rfvimptest.
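To illustrate how sequential p-value estimation can save permutations, here is a minimal Python sketch in the spirit of a Besag-Clifford style sequential Monte Carlo p-value. It is not the rfvimptest implementation. The function names, the stopping parameter `h`, the cap `max_perms`, and the toy statistic `abs_covariance` (standing in for a real VIMP such as pVIMP) are all assumptions made for illustration.

```python
import random


def sequential_permutation_pvalue(x, y, statistic, h=8, max_perms=500, seed=0):
    """Estimate a permutation p-value sequentially (illustrative sketch).

    Sampling stops as soon as `h` permuted statistics are at least as large
    as the observed one, or after `max_perms` permutations.
    Returns (p_value_estimate, permutations_used).
    """
    rng = random.Random(seed)
    observed = statistic(x, y)
    x_perm = list(x)
    exceed = 0
    for n_perms in range(1, max_perms + 1):
        rng.shuffle(x_perm)  # break any association between x and y
        if statistic(x_perm, y) >= observed:
            exceed += 1
            if exceed == h:
                # Early stop: the variable is clearly not significant.
                return h / n_perms, n_perms
    # Cap reached: conventional Monte Carlo estimate with +1 correction.
    return (exceed + 1) / (max_perms + 1), max_perms


def abs_covariance(x, y):
    """Toy importance statistic (assumption): absolute unnormalized covariance."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    return abs(sum((a - mx) * (b - my) for a, b in zip(x, y)))


# Toy data: one informative predictor and one pure-noise predictor.
data_rng = random.Random(1)
n = 60
x_signal = [data_rng.gauss(0, 1) for _ in range(n)]
x_noise = [data_rng.gauss(0, 1) for _ in range(n)]
y = [2.0 * a + data_rng.gauss(0, 1) for a in x_signal]

p_signal, used_signal = sequential_permutation_pvalue(x_signal, y, abs_covariance)
p_noise, used_noise = sequential_permutation_pvalue(x_noise, y, abs_covariance)
print("signal:", p_signal, used_signal)
print("noise: ", p_noise, used_noise)
```

The saving comes from the early exit. For an unimportant variable, permuted statistics typically match or exceed the observed one quickly, so the loop tends to stop after far fewer than `max_perms` refits. In a real application each call to `statistic` would mean recomputing the VIMP of a fitted model, which is where the computational cost lies.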
Journal
Journal title: Computational Statistics & Data Analysis
Year: 2023
ISSN: 0167-9473, 1872-7352
DOI: https://doi.org/10.1016/j.csda.2022.107689